Image Talk: A Real Time Synthetic Talking Head Using One Single Image with Chinese Text-To-Speech Capability
Authors
Abstract
Image Talk uses a single image to automatically create talking sequences in real time. The image can come from a photograph, a video clip, or a hand-drawn character. This interactive system accepts Chinese text and speaks it back in Mandarin Chinese, generating facial expressions in real time. Image Talk analyzes the Chinese text by converting it to a standard Pinyin system used in Taiwan and fetches the associated facial expressions from an expression pool. The expressions are synchronized with the sound and played back as a talking sequence. Image Talk also incorporates eye blinking and small-scale head rotation and translation perturbations to make the resulting sequence more natural. Switching to a different face image is also easy. The result is quite entertaining and can readily serve as a new human-machine interface, as well as for lip sync in computer-animated characters.
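The text-to-expression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup tables `CHAR_TO_PINYIN` and `EXPRESSION_POOL`, the mouth-shape names, and the fixed per-syllable duration are all hypothetical stand-ins. A real system would take syllable timings from the speech synthesizer and use a complete Pinyin converter.

```python
from dataclasses import dataclass

# Hypothetical lookup tables for illustration only; the abstract does not
# describe the actual Pinyin converter or the contents of the expression pool.
CHAR_TO_PINYIN = {"你": "ni3", "好": "hao3"}
EXPRESSION_POOL = {
    "ni": ["narrow", "spread"],
    "hao": ["open_wide", "rounded"],
}


@dataclass
class Keyframe:
    time: float  # seconds from the start of the utterance
    shape: str   # name of the mouth shape to display at that time


def build_timeline(text, syllable_duration=0.3):
    """Map each character to Pinyin, fetch its mouth shapes, and schedule them.

    Assumes every syllable lasts `syllable_duration` seconds; this is a
    simplification standing in for timings supplied by the synthesizer.
    """
    timeline = []
    t = 0.0
    for ch in text:
        syllable = CHAR_TO_PINYIN.get(ch)
        if syllable is None:
            shapes = ["neutral"]  # unknown character: hold a neutral face
        else:
            base = syllable.rstrip("12345")  # drop the tone digit
            shapes = EXPRESSION_POOL.get(base, ["neutral"])
        step = syllable_duration / len(shapes)
        for i, shape in enumerate(shapes):
            timeline.append(Keyframe(round(t + i * step, 3), shape))
        t += syllable_duration
    # Return to a neutral face once the utterance ends.
    timeline.append(Keyframe(round(t, 3), "neutral"))
    return timeline


if __name__ == "__main__":
    for frame in build_timeline("你好"):
        print(frame.time, frame.shape)
```

Playing the sequence back in sync with the audio then amounts to showing each keyframe's shape when the audio clock reaches its `time`. Blinks and small head perturbations could be layered on as additional, independently timed keyframes.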
Similar resources
A Speech Driven Talking Head System Based on a Single Face Image
In this paper, a lifelike talking head system is proposed. The talking head, which is driven by speaker independent speech recognition, requires only one single face image to synthesize lifelike facial expression. The proposed system uses speech recognition engines to get utterances and corresponding time stamps in the speech data. Associated facial expressions can be fetched from an expression...
Text2Video: Text-Driven Facial Animation using MPEG-4
We present a complete system for the automatic creation of talking head video sequences from text messages. Our system converts the text into MPEG-4 Facial Animation Parameters and synthetic voice. A user selected 3D character will perform lip movements synchronized to the speech data. The 3D models created from a single image vary from realistic people to cartoon characters. A voice selection ...
An investigation into the generation of mouth shapes for a talking head
BT is currently developing a low-computation, real-time talking head as an adjunct to the Laureate text-to-speech system [1]. Research into the development of a talking head may be divided into two components: image generation, and face and head movement control. This paper concentrates on the latter. A significant aspect of this work is research into methods of generating convincing m...
Photo-Realistic Talking-Heads from Image Samples
This paper describes a system for creating a photo-realistic model of the human head that can be animated and lip-synched from phonetic transcripts of text. Combined with a state-of-the-art text-to-speech synthesizer (TTS), it generates video animations of talking heads that closely resemble real people. To obtain a natural-looking head, we choose a “data-driven” approach. We record a talking...
Real-time streaming for the animation of talking faces in multiuser environments
In order to enable face animation on the Internet using high quality synthetic speech, the Text-to-Speech (TTS) servers need to be implemented on network-based servers and shared by many users. The output of a TTS server is used to animate talking heads as defined in MPEG-4. The TTS server creates two sets of data: audio data and Phonemes with optional Facial Animation Parameters (FAP) like smi...